Exploration server on Mozart

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Social Agents Playing a Periodical Policy

Internal identifier: 002462 (Main/Exploration); previous: 002461; next: 002463

Authors: Ann Nowé [Belgium]; Johan Parent [Belgium]; Katja Verbeeck [Belgium]

RBID : ISTEX:B9569E9E27EC810019E7B082F8F573E317E9C04A

Abstract

Coordination is an important issue in multiagent systems. Within the stochastic game framework this problem translates to policy learning in a joint action space. This technique however suffers some important drawbacks like the assumption of the existence of a unique Nash equilibrium and synchronicity, the need for central control, the cost of communication, etc. Moreover in general sum games it is not always clear which policies should be learned. Playing pure Nash equilibrium is often unfair to at least one of the players, while playing a mixed strategy doesn’t give any guarantee for coordination and usually results in a sub-optimal payoff for all agents. In this work we show the usefulness of periodical policies, which arise as a side effect of the fairness conditions used by the agents. We are interested in games which assume competition between the players, but where the overall performance can only be as good as the performance of the poorest player. Players are social distributed reinforcement learners, who have to learn to equalize their payoff. Our approach is illustrated on synchronous one-step games as well as on asynchronous job scheduling games.
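
The abstract's comparison of pure, mixed, and periodical play can be made concrete with a small numeric sketch. The 2x2 game below is a hypothetical example chosen for illustration, not one taken from the paper, and the code only compares payoffs; it does not implement the authors' learning scheme. The probabilities 2/3 and 1/3 are the mixed equilibrium of this assumed game.

```python
from fractions import Fraction

# Hypothetical 2x2 general-sum game (an assumption, not from the paper):
# players are paid only when they pick the same action, but each one
# prefers a different joint action.
PAYOFFS = {
    ("A", "A"): (2, 1),
    ("B", "B"): (1, 2),
    ("A", "B"): (0, 0),
    ("B", "A"): (0, 0),
}


def mixed_payoffs(p_row_a, p_col_a):
    """Expected (row, column) payoffs when each player plays A with the given probability."""
    probs = {"A": (p_row_a, p_col_a), "B": (1 - p_row_a, 1 - p_col_a)}
    row_total = col_total = Fraction(0)
    for (row_action, col_action), (row_pay, col_pay) in PAYOFFS.items():
        weight = probs[row_action][0] * probs[col_action][1]
        row_total += weight * row_pay
        col_total += weight * col_pay
    return row_total, col_total


def periodic_payoffs(cycle):
    """Average (row, column) payoffs per round when the joint actions in `cycle` repeat."""
    row_total = sum(PAYOFFS[joint][0] for joint in cycle)
    col_total = sum(PAYOFFS[joint][1] for joint in cycle)
    return Fraction(row_total, len(cycle)), Fraction(col_total, len(cycle))


pure = PAYOFFS[("A", "A")]                               # unequal payoffs
mixed = mixed_payoffs(Fraction(2, 3), Fraction(1, 3))    # equal but low payoffs
periodic = periodic_payoffs([("A", "A"), ("B", "B")])    # equal and higher payoffs
print(pure, mixed, periodic)
```

In this assumed game the pure equilibrium pays the players unequally, the mixed equilibrium pays both 2/3, and alternating between the two pure equilibria pays both 3/2 on average.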

DOI: 10.1007/3-540-44795-4_33



The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct:series">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Social Agents Playing a Periodical Policy</title>
<author>
<name sortKey="Nowe, Ann" sort="Nowe, Ann" uniqKey="Nowe A" first="Ann" last="Nowé">Ann Nowé</name>
</author>
<author>
<name sortKey="Parent, Johan" sort="Parent, Johan" uniqKey="Parent J" first="Johan" last="Parent">Johan Parent</name>
</author>
<author>
<name sortKey="Verbeeck, Katja" sort="Verbeeck, Katja" uniqKey="Verbeeck K" first="Katja" last="Verbeeck">Katja Verbeeck</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:B9569E9E27EC810019E7B082F8F573E317E9C04A</idno>
<date when="2001" year="2001">2001</date>
<idno type="doi">10.1007/3-540-44795-4_33</idno>
<idno type="url">https://api.istex.fr/document/B9569E9E27EC810019E7B082F8F573E317E9C04A/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001674</idno>
<idno type="wicri:Area/Istex/Curation">001319</idno>
<idno type="wicri:Area/Istex/Checkpoint">001C99</idno>
<idno type="wicri:doubleKey">0302-9743:2001:Nowe A:social:agents:playing</idno>
<idno type="wicri:Area/Main/Merge">002515</idno>
<idno type="wicri:Area/Main/Curation">002462</idno>
<idno type="wicri:Area/Main/Exploration">002462</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Social Agents Playing a Periodical Policy</title>
<author>
<name sortKey="Nowe, Ann" sort="Nowe, Ann" uniqKey="Nowe A" first="Ann" last="Nowé">Ann Nowé</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Computational Modeling Lab</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Vrije Universiteit Brussel</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Belgique</country>
</affiliation>
</author>
<author>
<name sortKey="Parent, Johan" sort="Parent, Johan" uniqKey="Parent J" first="Johan" last="Parent">Johan Parent</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Computational Modeling Lab</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Vrije Universiteit Brussel</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Belgique</country>
</affiliation>
</author>
<author>
<name sortKey="Verbeeck, Katja" sort="Verbeeck, Katja" uniqKey="Verbeeck K" first="Katja" last="Verbeeck">Katja Verbeeck</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Computational Modeling Lab</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Vrije Universiteit Brussel</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Belgique</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2001</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">B9569E9E27EC810019E7B082F8F573E317E9C04A</idno>
<idno type="DOI">10.1007/3-540-44795-4_33</idno>
<idno type="ChapterID">Chap33</idno>
<idno type="ChapterID">33</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: Coordination is an important issue in multiagent systems. Within the stochastic game framework this problem translates to policy learning in a joint action space. This technique however suffers some important drawbacks like the assumption of the existence of a unique Nash equilibrium and synchronicity, the need for central control, the cost of communication, etc. Moreover in general sum games it is not always clear which policies should be learned. Playing pure Nash equilibrium is often unfair to at least one of the players, while playing a mixed strategy doesn’t give any guarantee for coordination and usually results in a sub-optimal payoff for all agents. In this work we show the usefulness of periodical policies, which arise as a side effect of the fairness conditions used by the agents. We are interested in games which assume competition between the players, but where the overall performance can only be as good as the performance of the poorest player. Players are social distributed reinforcement learners, who have to learn to equalize their payoff. Our approach is illustrated on synchronous one-step games as well as on asynchronous job scheduling games.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Belgique</li>
</country>
</list>
<tree>
<country name="Belgique">
<noRegion>
<name sortKey="Nowe, Ann" sort="Nowe, Ann" uniqKey="Nowe A" first="Ann" last="Nowé">Ann Nowé</name>
</noRegion>
<name sortKey="Nowe, Ann" sort="Nowe, Ann" uniqKey="Nowe A" first="Ann" last="Nowé">Ann Nowé</name>
<name sortKey="Nowe, Ann" sort="Nowe, Ann" uniqKey="Nowe A" first="Ann" last="Nowé">Ann Nowé</name>
<name sortKey="Parent, Johan" sort="Parent, Johan" uniqKey="Parent J" first="Johan" last="Parent">Johan Parent</name>
<name sortKey="Parent, Johan" sort="Parent, Johan" uniqKey="Parent J" first="Johan" last="Parent">Johan Parent</name>
<name sortKey="Parent, Johan" sort="Parent, Johan" uniqKey="Parent J" first="Johan" last="Parent">Johan Parent</name>
<name sortKey="Verbeeck, Katja" sort="Verbeeck, Katja" uniqKey="Verbeeck K" first="Katja" last="Verbeeck">Katja Verbeeck</name>
<name sortKey="Verbeeck, Katja" sort="Verbeeck, Katja" uniqKey="Verbeeck K" first="Katja" last="Verbeeck">Katja Verbeeck</name>
<name sortKey="Verbeeck, Katja" sort="Verbeeck, Katja" uniqKey="Verbeeck K" first="Katja" last="Verbeeck">Katja Verbeeck</name>
</country>
</tree>
</affiliations>
</record>
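
A record like the one above can also be read with a general-purpose XML parser. The following is a minimal sketch in Python, assuming a trimmed copy of the record held in a string. As displayed here, the record uses the `wicri:` prefix without declaring it, so the sketch binds the prefix to a placeholder namespace (`urn:example:wicri`, a made-up URI) before parsing. The names `RECORD`, `WICRI_NS` and `parse_record` are illustrative and not part of Dilib.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record shown above (only the fields used below).
RECORD = """<record>
<TEI wicri:istexFullTextTei="biblStruct:series">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Social Agents Playing a Periodical Policy</title>
<author>
<name sortKey="Nowe, Ann" first="Ann" last="Nowé">Ann Nowé</name>
</author>
<author>
<name sortKey="Parent, Johan" first="Johan" last="Parent">Johan Parent</name>
</author>
<author>
<name sortKey="Verbeeck, Katja" first="Katja" last="Verbeeck">Katja Verbeeck</name>
</author>
</titleStmt>
<publicationStmt>
<date when="2001" year="2001">2001</date>
<idno type="doi">10.1007/3-540-44795-4_33</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

# Placeholder namespace URI (an assumption): the parser rejects an
# undeclared prefix, so we declare wicri: on the root element ourselves.
WICRI_NS = "urn:example:wicri"


def parse_record(text):
    """Return the title, author names, year and DOI found in a record string."""
    patched = text.replace("<record>", f'<record xmlns:wicri="{WICRI_NS}">', 1)
    root = ET.fromstring(patched)
    title_stmt = root.find(".//titleStmt")
    publication = root.find(".//publicationStmt")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [name.text for name in title_stmt.findall("author/name")],
        "year": publication.findtext("date"),
        "doi": publication.findtext("idno[@type='doi']"),
    }


info = parse_record(RECORD)
print(info)
```

The same approach should extend to the full record, since only the root element is patched; the element paths would need adjusting for other fields such as the affiliations.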

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Musique/explor/MozartV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002462 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 002462 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Musique
   |area=    MozartV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:B9569E9E27EC810019E7B082F8F573E317E9C04A
   |texte=   Social Agents Playing a Periodical Policy
}}

This area was generated with Dilib version V0.6.20.
Data generation: Sun Apr 10 15:06:14 2016. Site generation: Tue Feb 7 15:40:35 2023